CLSI: A Flexible Approximation Scheme from Clustered Term-Document Matrices

Authors

  • Efstratios Gallopoulos
  • Dimitrios Zeimpekis

Abstract

We investigate a methodology for matrix approximation and information retrieval (IR). A central feature of the techniques is an initial clustering phase on the columns of the term-document matrix, followed by a partial SVD on the columns constituting each cluster. The extracted information is used to build effective low-rank approximations to the original matrix as well as for IR. The algorithms can be expressed by means of rank reduction formulas. Experiments indicate that these methods can achieve good overall performance for matrix approximation and IR and compete well with existing schemes.
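The pipeline outlined in the abstract (cluster the columns, take a partial SVD of each cluster, combine the results into a low-rank approximation, then use it for retrieval) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Several choices here are assumptions made only for illustration: plain k-means as the clustering step, an orthogonal projection onto the combined left singular vectors as the approximation, cosine similarity in the projected space for retrieval, and all function names (`kmeans_columns`, `clustered_basis`, `clustered_approx`, `query_scores`).

```python
import numpy as np


def kmeans_columns(A, n_clusters, n_iter=20, seed=0):
    """Plain k-means on the columns of A (assumed stand-in for the clustering phase)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(A, dtype=float).T  # one row per document
    centers = X[rng.choice(X.shape[0], size=n_clusters, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every document to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(n_clusters):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def clustered_basis(A, labels, k):
    """Orthonormal basis built from the k leading left singular vectors of each cluster."""
    blocks = []
    for j in np.unique(labels):
        A_j = A[:, labels == j]  # columns belonging to cluster j
        U, _, _ = np.linalg.svd(A_j, full_matrices=False)
        blocks.append(U[:, : min(k, U.shape[1])])
    # Vectors from different clusters are not mutually orthogonal, so re-orthonormalize.
    Q, _ = np.linalg.qr(np.hstack(blocks))
    return Q


def clustered_approx(A, labels, k):
    """Low-rank approximation: project A onto the span of the combined cluster bases."""
    Q = clustered_basis(A, labels, k)
    return Q @ (Q.T @ A)


def query_scores(A, labels, k, q):
    """Cosine similarity between a query vector and each document in the reduced space."""
    Q = clustered_basis(A, labels, k)
    docs = Q.T @ A
    q_red = Q.T @ q
    denom = np.linalg.norm(docs, axis=0) * np.linalg.norm(q_red)
    return (q_red @ docs) / np.maximum(denom, 1e-12)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.random((50, 80))  # synthetic 50-term by 80-document matrix
    labels = kmeans_columns(A, n_clusters=4)
    A_k = clustered_approx(A, labels, k=3)
    rel_err = np.linalg.norm(A - A_k) / np.linalg.norm(A)
    print("rank bound:", 4 * 3, "relative error:", rel_err)
```

With `n_clusters` clusters and `k` singular vectors per cluster, the approximation has rank at most `n_clusters * k`, so the two parameters trade approximation quality against the cost of the per-cluster SVDs.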

Similar resources

Function Approximation Approach for Robust Adaptive Control of Flexible joint Robots

This paper is concerned with the problem of designing a robust adaptive controller for flexible joint robots (FJR). Under the assumption of weak joint elasticity, the FJR is first modeled and converted into singular perturbation form. The control law consists of a FAT-based adaptive control strategy and a simple correction term. The first term of the controller is used to stabilize the slow dy...

Structure Preserving Dimension Reduction for Clustered Text Data Based on the Generalized Singular Value Decomposition

In today’s vector space information retrieval systems, dimension reduction is imperative for efficiently manipulating the massive quantity of data. To be useful, this lower-dimensional representation must be a good approximation of the full document set. To that end, we adapt and extend the discriminant analysis projection used in pattern recognition. This projection preserves cluster structure...

Clustered Matrix Approximation

In this paper we develop a novel clustered matrix approximation framework, first showing the motivation behind our research. The proposed methods are particularly well suited for problems with large scale sparse matrices that represent graphs and/or bipartite graphs from information science applications. Our framework and resulting approximations have a number of benefits: (1) the approximation...

A New Document Embedding Method for News Classification

Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into pre-defined categories. A great deal of news spreads on the web. A text classifier can categorize news automatically, which facilitates and accelerates access to it. The first step in text classification is to represent documents in a suitable way t...

Operational matrices with respect to Hermite polynomials and their applications in solving linear differential equations with variable coefficients

In this paper, a new and efficient approach is applied for numerical approximation of linear differential equations with variable coefficients based on operational matrices with respect to Hermite polynomials. Explicit formulae which express the Hermite expansion coefficients for the moments of derivatives of any differentiable function in terms of the original expansion coefficients of the f...

Publication date: 2005